Automated Syllabus of Computer Science Papers

Built by Rex W. Douglass (@RexDouglass); GitHub; LinkedIn

Papers curated by hand, summaries and taxonomy written by LLMs.

Submit a paper to be added for review

Optimizing Model Selection and Robustness Techniques

> Model Stability and Robustness

>> Model Selection Stability
  • Prioritize stability when choosing a model selection procedure: unstable procedures are highly sensitive to small changes in the data, and that sensitivity can translate into large predictive losses. (Breiman 1996)
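One way to see this point is to measure how much a selection procedure's output changes under resampling. The sketch below is illustrative only (the data, the toy correlation-based selection rule, and the Jaccard stability score are my own choices, not Breiman's procedure): it reruns feature selection on bootstrap resamples and reports the average overlap of the selected sets.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data: features 0 and 1 are informative, the rest are noise.
n, p = 60, 10
X = rng.normal(size=(n, p))
y = X[:, 0] + X[:, 1] + rng.normal(size=n)

def select_top_k(X, y, k=2):
    """Toy selection rule: keep the k features most correlated with y."""
    scores = np.abs([np.corrcoef(X[:, j], y)[0, 1] for j in range(X.shape[1])])
    return frozenset(np.argsort(scores)[-k:])

def stability(X, y, n_boot=50, k=2):
    """Mean pairwise Jaccard similarity of selected sets across bootstrap
    resamples; values near 1 indicate a stable procedure."""
    sets = []
    for _ in range(n_boot):
        idx = rng.integers(0, len(y), len(y))
        sets.append(select_top_k(X[idx], y[idx], k))
    sims = [len(a & b) / len(a | b)
            for i, a in enumerate(sets) for b in sets[i + 1:]]
    return float(np.mean(sims))

print(round(stability(X, y), 3))
```

An unstable procedure (e.g., best-subset selection with many correlated predictors) would score much lower on the same diagnostic, which is the warning sign the bullet describes.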
>> Leveraging Multi-View, Multiclass, and Multi-Task Approaches
  • Consider a Bayesian undirected graphical model for co-training. It puts semi-supervised multi-view learning on a principled footing by making the assumptions behind co-training explicit, and it yields a co-training kernel for Gaussian process classifiers that avoids local-maxima problems while estimating the reliability of each view. (Blum and Mitchell 1998)
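The core co-training loop is simple to sketch. This is a minimal illustration, not the cited method: it substitutes nearest-centroid classifiers for the paper's models and uses synthetic two-view data, but it shows the key move of letting each view confidently label points for the other.

```python
import numpy as np

rng = np.random.default_rng(1)

# Two synthetic "views" of the same binary label.
n = 200
y = rng.integers(0, 2, n)
view1 = y[:, None] * 2.0 + rng.normal(size=(n, 2))
view2 = y[:, None] * -2.0 + rng.normal(size=(n, 2))

# Start with 5 labeled examples per class; the rest are unlabeled (-1).
labeled = np.zeros(n, dtype=bool)
labeled[np.flatnonzero(y == 0)[:5]] = True
labeled[np.flatnonzero(y == 1)[:5]] = True
pseudo = np.where(labeled, y, -1)

def centroid_scores(X, pseudo):
    """Signed score from class centroids (a stand-in classifier):
    positive means closer to the class-1 centroid."""
    c0 = X[pseudo == 0].mean(axis=0)
    c1 = X[pseudo == 1].mean(axis=0)
    return np.linalg.norm(X - c0, axis=1) - np.linalg.norm(X - c1, axis=1)

# Co-training loop: each view labels its single most confident unlabeled point.
for _ in range(5):
    for X in (view1, view2):
        s = centroid_scores(X, pseudo)
        unl = np.flatnonzero(pseudo == -1)
        if len(unl) == 0:
            break
        best = unl[np.argmax(np.abs(s[unl]))]
        pseudo[best] = int(s[best] > 0)

acc = np.mean(pseudo[pseudo != -1] == y[pseudo != -1])
```

Because the two views carry complementary information, the confidently pseudo-labeled points tend to be correct, which is what makes the bootstrapping between views work.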
>> Balancing Interpretability, Predictive Power, and Communication Constraints
  • Consider Kernel Regularized Least Squares (KRLS) for social-science modeling and inference. It combines the familiar interpretation of generalized linear models (GLMs) with the flexibility of machine learning, estimating complex nonlinear relationships while still yielding interpretable quantities such as pointwise marginal effects. (Hainmueller and Hazlett 2014)
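The estimator itself has a closed form. A minimal numpy sketch (synthetic data; the bandwidth and regularization values are arbitrary choices, and the numerical derivative is a stand-in for the paper's analytic marginal effects):

```python
import numpy as np

rng = np.random.default_rng(2)

# A nonlinear relationship that a linear GLM would misspecify.
X = rng.uniform(-2, 2, size=(80, 1))
y = np.sin(2 * X[:, 0]) + rng.normal(scale=0.1, size=80)

def gauss_kernel(A, B, sigma2=1.0):
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / sigma2)

# KRLS closed form: alpha = (K + lam * I)^{-1} y
lam = 0.1
K = gauss_kernel(X, X)
alpha = np.linalg.solve(K + lam * np.eye(len(y)), y)

def predict(Xnew):
    return gauss_kernel(Xnew, X) @ alpha

# The interpretability piece: pointwise marginal effects are derivatives of
# the fitted surface, approximated here by finite differences at x = 0.
eps = 1e-4
grid = np.array([[0.0]])
slope = (predict(grid + eps) - predict(grid - eps)) / (2 * eps)
```

The fitted slope at x = 0 should be near the true derivative of sin(2x) there, i.e., about 2, which is the kind of GLM-style quantity KRLS is designed to recover.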

> Sparse Regularization Methods for Complex Data Structures

>> Sparsity Induction via Group Lasso & Tree-based Methods
  • Consider hierarchical group-lasso regularization when estimating pairwise interactions in linear or logistic regression: whenever an interaction is estimated to be nonzero, both of its associated main effects are also included in the model, which keeps the fitted interaction model interpretable. (Lim and Hastie 2015)

> Optimal Error Approximation & Regularization Parameter Selection

>> Convex Risk Minimization for Classification Accuracy
  • Consider convex risk minimization techniques such as support vector machines or AdaBoost to approximate the optimal Bayes error rate in classification: minimizing a convex surrogate loss can yield consistent estimates and performs well relative to traditional maximum likelihood approaches. (Zhang 2004)
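The idea is that minimizing a convex surrogate (here the logistic loss, one of the losses Zhang analyzes) drives the resulting classifier's 0-1 error toward the Bayes error. A minimal gradient-descent sketch on synthetic data (the data-generating process and learning-rate choices are mine):

```python
import numpy as np

rng = np.random.default_rng(3)

# Two unit-variance Gaussian classes with means -1 and +1;
# the Bayes rule is a threshold at x = 0.
n = 500
y = rng.integers(0, 2, n) * 2 - 1          # labels in {-1, +1}
x = y * 1.0 + rng.normal(size=n)

# Gradient descent on the convex surrogate: mean log(1 + exp(-y * (w*x + b))).
w, b = 0.0, 0.0
for _ in range(500):
    margin = y * (w * x + b)
    g = -y / (1 + np.exp(margin))          # derivative of the logistic loss
    w -= 0.1 * np.mean(g * x)
    b -= 0.1 * np.mean(g)

err = np.mean(np.sign(w * x + b) != y)     # 0-1 error of the fitted rule
bayes_rule_err = np.mean((x > 0) != (y > 0))  # empirical error of the Bayes rule
```

Although we never optimized the (non-convex, non-smooth) 0-1 loss directly, the fitted classifier's error lands close to that of the Bayes rule, which is the consistency phenomenon the bullet refers to.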

>> Regularized Estimation & Probability Bounds for Misclassification
  • Choose an estimator for a linear inverse problem by minimizing a measure of distance from a prior guess (when one exists) subject to the data constraints, where the distance measure satisfies the postulates of regularity, locality, and composition consistency. (NA?)
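As one concrete instance of this recipe (a quadratic distance, which is only one of the measures satisfying such postulates; the setup below is illustrative), an underdetermined linear system can be resolved by pulling the solution toward a prior guess:

```python
import numpy as np

rng = np.random.default_rng(4)

# Underdetermined linear inverse problem: 5 observations, 10 unknowns.
A = rng.normal(size=(5, 10))
x_true = rng.normal(size=10)
b = A @ x_true
x_prior = np.zeros(10)                     # prior guess

def regularized_solve(A, b, x_prior, lam=1e-3):
    """Minimize ||A x - b||^2 + lam * ||x - x_prior||^2, i.e., fit the data
    while staying close (in squared distance) to the prior guess.
    Closed form: x = (A^T A + lam I)^{-1} (A^T b + lam x_prior)."""
    n = A.shape[1]
    return np.linalg.solve(A.T @ A + lam * np.eye(n),
                           A.T @ b + lam * x_prior)

x_hat = regularized_solve(A, b, x_prior)
residual = np.linalg.norm(A @ x_hat - b)
```

Among the infinitely many exact solutions, the penalty picks the one closest to the prior guess, which is the defining behavior of this family of estimators.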
>> Optimal Margin Robustness via ERM Classifiers
  • Use empirical risk minimization (ERM) classifiers built over finite sieves that do not require knowledge of the margin parameter: such classifiers are robust to the margin and attain optimal rates in statistical learning problems over massive sets with complex boundaries. (Stevens 1946)
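ERM over a finite sieve just means: enumerate a finite class of candidate rules and keep the one with the smallest empirical risk. A toy sketch (threshold classifiers on a grid as the sieve, with label noise; all choices here are illustrative, not from the cited work):

```python
import numpy as np

rng = np.random.default_rng(5)

# 1-D classification whose Bayes rule is a threshold at t = 0.3,
# observed through 10% label noise.
n = 400
x = rng.uniform(-1, 1, n)
y = (x > 0.3).astype(int)
y = np.where(rng.random(n) < 0.1, 1 - y, y)

# A finite sieve: threshold classifiers 1{x > t} for t on a grid.
grid = np.linspace(-1, 1, 201)
emp_risk = [np.mean(((x > t).astype(int)) != y) for t in grid]
t_hat = grid[int(np.argmin(emp_risk))]     # the ERM pick from the sieve
```

Note the procedure never needed to know the noise level or margin behavior; it simply minimized empirical risk over the sieve, yet it recovers a threshold near the true 0.3.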

References

Blum, Avrim, and Tom Mitchell. 1998. “Combining Labeled and Unlabeled Data with Co-Training.” Proceedings of the Eleventh Annual Conference on Computational Learning Theory, July. https://doi.org/10.1145/279943.279962.
Breiman, Leo. 1996. “Heuristics of Instability and Stabilization in Model Selection.” The Annals of Statistics 24 (December). https://doi.org/10.1214/aos/1032181158.
Hainmueller, Jens, and Chad Hazlett. 2014. “Kernel Regularized Least Squares: Reducing Misspecification Bias with a Flexible and Interpretable Machine Learning Approach.” Political Analysis 22. https://doi.org/10.1093/pan/mpt019.
Lim, Michael, and Trevor Hastie. 2015. “Learning Interactions via Hierarchical Group-Lasso Regularization.” Journal of Computational and Graphical Statistics 24 (July). https://doi.org/10.1080/10618600.2014.938812.
Stevens, S. S. 1946. “On the Theory of Scales of Measurement.” Science 103 (June). https://doi.org/10.1126/science.103.2684.677.
Zhang, Tong. 2004. “Statistical Behavior and Consistency of Classification Methods Based on Convex Risk Minimization.” The Annals of Statistics 32 (February). https://doi.org/10.1214/aos/1079120130.